Topic Identification in Soft Clustering using PCA and ICA

Authors

  • Leonid Zhukov
  • David Gleich
  • Harvey Mudd
Abstract

Many applications can benefit from soft clustering, where each datum is assigned to multiple clusters with membership weights that sum to one. In this paper we present a comparison of principal component analysis (PCA) and independent component analysis (ICA) when used for soft clustering. We provide a short mathematical background for these methods and demonstrate their application to a sponsored links search listings dataset. We present examples of the soft clusters generated by both methods and compare the results.
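The abstract's definition of soft clustering (each datum receives membership weights that sum to one) can be illustrated with a minimal sketch. It is not the paper's procedure: the toy matrix, the function names, and the rule of normalizing absolute PCA projections into weights are all assumptions made for illustration, and the ICA step is omitted.

```python
import math

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def principal_directions(rows, k, iters=500):
    """Center the data and estimate k leading principal directions
    by power iteration with deflation (illustrative, not optimized)."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [[sum(c[a] * c[b] for c in centered) / n for b in range(d)]
           for a in range(d)]
    directions = []
    for _component in range(k):
        # Deterministic, non-uniform start vector, normalized to unit length.
        v = [float(j + 1) for j in range(d)]
        norm = math.sqrt(_dot(v, v))
        v = [x / norm for x in v]
        for _step in range(iters):
            w = [_dot(row, v) for row in cov]
            norm = math.sqrt(_dot(w, w))
            if norm == 0.0:  # no variance left to explain
                break
            v = [x / norm for x in w]
        # Remove the found direction so the next pass finds the next one.
        lam = _dot(v, [_dot(row, v) for row in cov])
        cov = [[cov[a][b] - lam * v[a] * v[b] for b in range(d)]
               for a in range(d)]
        directions.append(v)
    return centered, directions

def soft_memberships(rows, k):
    """Assumed rule: weight of datum i in cluster j is the absolute
    projection onto direction j, normalized so each row sums to one."""
    centered, directions = principal_directions(rows, k)
    weights = []
    for c in centered:
        scores = [abs(_dot(c, v)) for v in directions]
        total = sum(scores)
        weights.append([s / total for s in scores] if total > 0
                       else [1.0 / k] * k)
    return weights

# Hypothetical toy term-count matrix: rows are listings, columns are terms.
toy = [
    [3, 2, 0, 0],
    [2, 3, 0, 0],
    [4, 3, 0, 1],
    [0, 0, 3, 2],
    [0, 1, 2, 3],
    [1, 1, 1, 1],
]
weights = soft_memberships(toy, 2)
for row in weights:
    print([round(x, 3) for x in row])
```

Each printed row holds one listing's weights over two clusters. Other normalizations (for example squared projections) would satisfy the same sum-to-one constraint.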

Similar resources

Soft Clustering with Projections: PCA, ICA, and Laplacian

In this paper we present a comparison of three projection methods that use the eigenvectors of a matrix to investigate high-dimensional datasets: principal component analysis (PCA), principal component analysis followed by independent component analysis (PCA+ICA), and Laplacian projections. We demonstrate the application of these methods to a sponsored links search listings dataset and provide a...

Study of Multivariate Data Clustering Based on K-Means and Independent Component Analysis

For the last two decades, clustering has been a well-recognized area of data mining research. Data clustering plays a major role in pattern recognition, signal processing, bioinformatics, and artificial intelligence. Clustering is an unsupervised learning technique that generates groups of objects based on their similarity in such a way that the objects belonging to other gro...

The Independent and Principal Component of Graph Spectra

In this paper, we demonstrate how PCA and ICA can be used for embedding graphs in pattern-spaces. Graph spectral feature vectors are calculated from the leading eigenvalues and eigenvectors of the unweighted graph adjacency matrix. The vectors are then embedded in a lower dimensional pattern space using both the PCA and ICA decomposition methods. Synthetic and real sequences are tested using th...

Comparison of MLP NN Approach with PCA and ICA for Extraction of Hidden Regulatory Signals in Biological Networks

Biologists now face masses of high-dimensional datasets generated by various high-throughput technologies. These datasets are the outputs of complex interconnected biological networks at different levels, driven by a number of hidden regulatory signals. So far, many computational and statistical methods such as PCA and ICA have been employed for computing low-dimensional or hidden represe...

Sparse ICA via cluster-wise PCA

In this paper, it is shown that independent component analysis (ICA) of sparse signals (sparse ICA) can be seen as a cluster-wise principal component analysis (PCA). Consequently, sparse ICA may be done by a combination of a clustering algorithm and PCA. For the clustering part, we use an algorithm inspired by K-means. The final algorithm is easy to implement for any number of...

Publication date: 2004